Generation of arbitrarily two-point correlated random networks 
Random networks are intensively used as null models to investigate properties of complex
networks. We describe an efficient and accurate algorithm to generate arbitrarily two-point
correlated undirected random networks without self- or multiple-edges among vertices. With
the goal of systematically investigating the influence of two-point correlations, we
furthermore develop a formalism to construct a joint degree distribution $P(j,k)$ which
allows one to fix an arbitrary degree distribution $P(k)$ and an arbitrary average nearest
neighbor function $k_{nn}(k)$ simultaneously. Using the presented algorithm, this formalism
is demonstrated with scale-free networks ($P(k) \propto k^{-\gamma}$) and empirical complex
networks ($P(k)$ taken from network) as examples. Finally, we generalize our algorithm to
annealed networks, which allows networks to be represented in a mean-field-like manner.

PACS numbers: 89.75.Hc, 05.40.-a 



I. INTRODUCTION 

The fast developing research field of complex networks [l|, 0] focuses on the three main
aspects of (i) measuring network topology, (ii) investigating dynamics on networks, and
(iii) studying the interplay between dynamical processes on networks and the network
topology. Surprisingly, empirical networks from a vast variety of scientific fields share
many characteristic features. Prominent examples are the small-world property 0], high
clustering Q, and the scale-free degree distribution [f|. One possibility to unravel the
properties of empirical networks is to compare them to null models. Appropriate null models
are random networks that preserve some of the statistical features present in the empirical
network under investigation. This idea gave birth to the well-known configuration model
(CM) algorithm [|, 0, H H, which is capable of generating random networks with an a priori
given degree distribution. Some extensions to this model have been proposed to conserve
further statistical properties beyond the plain degree distribution, for instance the
degree-dependent clustering coefficient [11].

A fundamental way to categorize and distinguish empirical networks beyond the degree
distribution and clustering has been proposed by Newman [12, 13], who introduced the Newman
factor $r$. This number is basically the Pearson correlation coefficient of the degrees (the
number of edges emanating from a vertex) of connected vertices in a network and is therefore
fully defined by two-point correlations in a network. The Newman factor lies in the interval
$[-1, 1]$, where positive (negative) values indicate that vertices with the same (different)
degree tend to be connected, while a value of 0 means no correlation. Practically all
empirical networks show a non-trivial two-point correlation structure. An astonishing
observation is, for example, the fact that biological networks show negative Newman factors,
while technological networks display rather small values of the Newman factor close to zero,
whereas social networks tend to have rather large positive values [l|J|. The evident
importance of correlations within the degree distribution has led to many efforts: for
example, a hidden variable approach has been developed in Ref. [15], and so-called dK-series
networks, which systematically describe the full correlation structure of a network, have
been introduced in Ref. [1 €31 ] together with an algorithm for the lowest dK-classes. Thus,
an efficient random network generator which constructs null model networks on the basis of
an a priori prescribed two-point correlation structure is very important. Such a generator
is presented below; it allows one to construct undirected random networks with a prescribed
two-point correlation structure and hence much more realistic null models. The major
advantage of our generator in comparison with similar algorithms previously introduced
[H, [H, [13 is its high accuracy and the generality of the approach, which allows one to
construct networks with an arbitrary two-point correlation structure. As an application of
this scheme and in order to investigate the influence of two-point correlations within
empirical networks, we address the question of how one can model two-point correlations
while preserving the degree distribution of a network. This is fundamental, for instance,
in order to shed light on the interplay between dynamical processes on networks and the
underlying network topology with respect to two-point correlations.

The modeling of two-point correlations is especially interesting for the verification of
theoretical predictions from theories describing dynamical processes on networks which do
incorporate two-point correlations. Due to the small-world effect present in networks, it is
common practice to use a mean-field (MF) ansatz. Hence, within these theories the network is
modeled using a probabilistic approach and vertices are only connected with a certain
probability to each other. The idea to represent a network by probabilities has already been
brought up in the context of Kauffman's model of random complex automata [H, . This
so-called annealed network changes in every time step such that all edges are redistributed.
A similar approach has recently been applied by Stauffer and Sahimi to scale-free networks
to study the effect of 'annealed disorder' on a diffusion process [20]. Such annealed
networks are ideally suited to test the validity of MF theories of dynamics on networks.
We extend this approach below by generalizing our algorithm to allow for the construction
of two-point correlated annealed networks.

This paper is organized as follows: Section II introduces the network correlation measures
used in this paper. Section III describes the algorithm to construct arbitrarily two-point
correlated networks. Section IV develops a formalism which allows one to fix a degree
distribution and to arbitrarily choose the two-point correlations at the same time. The
formalism is demonstrated with scale-free networks and empirical networks as examples.
Section VI introduces the notion of a two-point correlated annealed network. We conclude
and give an outlook in Section VII.



II. CORRELATION MEASURES 

The following is a short summary of common definitions, adapted to our purposes, which will
be used frequently within this paper. Two-point correlations are statistically described by
the joint degree distribution $P(j,k)$, which is the probability that a randomly chosen edge
of the network has vertices with degrees $j$ and $k$ at its ends. This distribution is a
symmetric function in the case of undirected networks, $P(j,k) = P(k,j)$. By summation over
either argument of $P(j,k)$, one obtains the distribution over edge ends,

$$ P_e(k) = \sum_j P(j,k), \qquad (1) $$

which is related to the distribution of vertices by

$$ P(k) = \frac{\bar{k}}{k} P_e(k). \qquad (2) $$

This last relation (2) between the edge end distribution $P_e(k)$ and the degree
distribution $P(k)$ can easily be understood by the fact that every vertex with degree $k$
has probability $P(k)$ of being drawn at random from the network. Therefore, the probability
to draw an edge end connected to a vertex of degree $k$ is proportional to $k P(k)$.
Normalizing this last expression yields the edge end distribution
$P_e(k) = k P(k)/\bar{k}$. Here, $\bar{k} = \sum_k k P(k)$ denotes the mean with respect to
the degree distribution $P(k)$. This mean has to be carefully distinguished from the mean
with respect to the edge end distribution $P_e(k)$, which we denote by
$\langle k \rangle = \sum_k k P_e(k) = \overline{k^2}/\bar{k}$. It is convenient [21] to
extract the actual correlations from $P(j,k)$ by relating it to the uncorrelated case
$P_{uc}(j,k)$, which has the special product form

$$ P_{uc}(j,k) = P_e(j) P_e(k). \qquad (3) $$



By taking the ratio between $P(j,k)$ and $P_{uc}(j,k)$, this defines

$$ f(j,k) = \frac{P(j,k)}{P_{uc}(j,k)} \qquad (4) $$

as a correlation function.
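The definitions of Eqs. (1)-(4) can be sketched in a few lines of Python. This is only an illustrative sketch: the dictionary representation `{(j, k): probability}`, the function names, and the toy values of `P_joint` are assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def edge_end_distribution(P_joint):
    """P_e(k) = sum_j P(j, k), Eq. (1)."""
    P_e = defaultdict(float)
    for (j, k), p in P_joint.items():
        P_e[k] += p
    return dict(P_e)


def degree_distribution(P_e):
    """P(k) = kbar * P_e(k) / k, Eq. (2); kbar follows from normalization."""
    inv_kbar = sum(p / k for k, p in P_e.items())  # this sum equals 1 / kbar
    return {k: (p / k) / inv_kbar for k, p in P_e.items()}


def correlation_function(P_joint, P_e):
    """f(j, k) = P(j, k) / (P_e(j) P_e(k)), Eqs. (3) and (4)."""
    return {(j, k): p / (P_e[j] * P_e[k]) for (j, k), p in P_joint.items()}


# Hypothetical symmetric toy distribution over the degrees 1 and 2.
P_joint = {(1, 1): 0.1, (1, 2): 0.2, (2, 1): 0.2, (2, 2): 0.5}
P_e = edge_end_distribution(P_joint)
P = degree_distribution(P_e)
f = correlation_function(P_joint, P_e)
```

Values of `f` above (below) 1 mark degree pairs that are connected more (less) often than in the uncorrelated case.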

However, the joint degree distribution $P(j,k)$ and the correlation function $f(j,k)$ are
complex functional objects which are hard to visualize. A way to quantify the overall
correlation present in a network was introduced by Newman. He defined the Newman factor $r$
to be the Pearson correlation coefficient of the remaining degrees of the two vertices at
either end of a randomly chosen edge. The use of the remaining degree, which is the actual
degree of a vertex minus one, is only an arithmetic trick to suppress some terms in
calculations performed by Newman. In this paper, we directly use the degrees of the
vertices, which is equivalent to Newman's definition in the limit of large networks,

$$ r = \frac{1}{\sigma^2} \sum_{j,k} j k \left( P(j,k) - P_e(j) P_e(k) \right). \qquad (5) $$



The Newman factor $r$ is normalized by $\sigma^2 = \langle k^2 \rangle - \langle k \rangle^2$
to fall into the range $[-1, 1]$. A positive (negative) value means that vertices with a
degree $k$ preferentially attach to vertices with a degree of the same (different) order,
which is referred to as (dis-)assortative mixing. The special case of $r = 0$ is achieved in
the case of no correlation, which can be seen by substituting $P_{uc}(j,k)$ of Eq. (3) into
Eq. (5). It is clear that the Newman factor $r$ quantifies the correlations present in a
network only on a global scale. An intermediate approach, on the level of degrees, has been
introduced in Ref. [22] with the average nearest neighbor function $k_{nn}(k)$. Using the
conditional probability

$$ P(j|k) = \frac{P(j,k)}{P_e(k)}, \qquad (6) $$



which is the probability that a randomly chosen neighbor of any vertex with degree $k$ has
the degree $j$, one defines $k_{nn}(k)$ to be

$$ k_{nn}(k) = \sum_j j P(j|k). \qquad (7) $$



In the case of an assortative (disassortative) network, the average nearest neighbor
function $k_{nn}(k)$ has to be an increasing (decreasing) function, while it has the
constant value $\langle k \rangle$ for uncorrelated networks. It is interesting to note that

$$ \langle k_{nn}(k) \rangle = \langle k \rangle \qquad (8) $$

is generally valid, which can be seen by plugging Eq. (6) into Eq. (7) and averaging the
resulting equality over $k$ with respect to the edge end distribution $P_e(k)$.
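As a small numerical illustration of Eqs. (5)-(8), the following sketch computes the Newman factor and the average nearest neighbor function from a joint degree distribution and checks the identity of Eq. (8). The dictionary representation, the function names, and the toy values are assumptions of this example.

```python
def edge_end_distribution(P_joint):
    """P_e(k) = sum_j P(j, k), Eq. (1)."""
    P_e = {}
    for (j, k), p in P_joint.items():
        P_e[k] = P_e.get(k, 0.0) + p
    return P_e


def newman_factor(P_joint):
    """r of Eq. (5), using degrees directly (not remaining degrees)."""
    P_e = edge_end_distribution(P_joint)
    mean = sum(k * p for k, p in P_e.items())            # <k>
    var = sum(k * k * p for k, p in P_e.items()) - mean ** 2  # sigma^2
    # sum_{j,k} jk P_e(j) P_e(k) factorizes into <k>^2
    cov = sum(j * k * p for (j, k), p in P_joint.items()) - mean ** 2
    return cov / var


def average_nearest_neighbor(P_joint):
    """k_nn(k) = sum_j j P(j|k) with P(j|k) = P(j, k) / P_e(k), Eqs. (6), (7)."""
    P_e = edge_end_distribution(P_joint)
    k_nn = {k: 0.0 for k in P_e}
    for (j, k), p in P_joint.items():
        k_nn[k] += j * p / P_e[k]
    return k_nn


# Hypothetical toy distribution over the degrees 1 and 2.
P_joint = {(1, 1): 0.1, (1, 2): 0.2, (2, 1): 0.2, (2, 2): 0.5}
P_e = edge_end_distribution(P_joint)
r = newman_factor(P_joint)
k_nn = average_nearest_neighbor(P_joint)
mean_k = sum(k * p for k, p in P_e.items())
mean_knn = sum(k_nn[k] * p for k, p in P_e.items())  # equals <k> by Eq. (8)
```

For this toy input the covariance is small and positive, so `r` indicates weak assortative mixing, and `k_nn` increases slightly with the degree.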



III. ALGORITHM 



The well-known CM algorithm [H, 0, HI fixes a priori a degree sequence which is usually
drawn from a given degree distribution $P(k)$. Each element of this degree sequence is the
number of desired edges emanating from a vertex. These may be thought of as half-edges
which still need to be joined with half-edges of other vertices. To construct the network,
the CM algorithm may be implemented by placing all half-edges of all vertices into a single
list, which is a discrete representation of the edge end distribution $P_e(k)$. An edge is
formed by selecting two random members of that list. If the constraint of neither self- nor
multiple-edges is met, the edge is created and the two half-edges are removed from the
list. As the first and the second draw are done from the same list or, equivalently, each
draw is done independently with the edge end distribution $P_e(k)$, the resulting network is
always uncorrelated. Only the constraint of self- and multiple-edge prevention induces some
intrinsic correlations, which can be avoided if the maximal degree $k_{\max}$ is limited
(cf. Section IV A). The CM algorithm paired with the correct choice of the maximal degree
$k_{\max}$ is also known as the uncorrelated CM (UCM) algorithm [10]. However, almost all
empirical networks do display two-point correlations in their topology. The algorithm
discussed below allows one to fix a priori an arbitrary joint degree distribution $P(j,k)$
and generates a network which is completely random in all other topological aspects, just
as the CM algorithm does with respect to the degree distribution $P(k)$.
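The half-edge list construction of the CM algorithm described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the `max_tries` guard against configurations that cannot be completed, and the example degree sequence are assumptions of this sketch.

```python
import random


def configuration_model(degrees, rng, max_tries=100_000):
    """Pair random half-edges, rejecting self- and multiple-edges."""
    # One list entry per half-edge; the value is the owning vertex.
    stubs = [v for v, k in enumerate(degrees) for _ in range(k)]
    edges = set()
    tries = 0
    while len(stubs) >= 2 and tries < max_tries:
        tries += 1
        a, b = rng.sample(range(len(stubs)), 2)  # two distinct list positions
        u, v = stubs[a], stubs[b]
        edge = (min(u, v), max(u, v))
        if u == v or edge in edges:
            continue  # would create a self- or multiple-edge: reject
        edges.add(edge)
        # Remove both half-edges (larger index first) by swapping with the end.
        for idx in sorted((a, b), reverse=True):
            stubs[idx] = stubs[-1]
            stubs.pop()
    return edges, stubs  # stubs holds half-edges that could not be paired


# Hypothetical degree sequence with an even number of half-edges.
degrees = [3, 2, 2, 2, 2, 1, 1, 1]
edges, leftover = configuration_model(degrees, random.Random(1))
```

Because both draws come from the same list, no degree correlations are imposed beyond those induced by the rejection of self- and multiple-edges.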

A major computational complication arises from the fact that probabilities in the $P(j,k)$
matrix may become very small, as the probability for one edge is of the order
$1/\bar{k}N$, which is computationally hard to handle for large $N$. Due to this problem,
we sample in a first step a half-edge with the usual edge end distribution $P_e(k)$, and in
a second step a half-edge from the conditional probability distribution $P(j|k)$. These two
distributions are much easier to sample, as they result from sums over $P(j,k)$ and
therefore contain probabilities of larger magnitude.

The overall scheme of the algorithm to construct a network with $N$ vertices and a given
joint degree distribution $P(j,k)$ is the following:

1. As in the CM algorithm, one first has to draw a degree sequence by calculating the
theoretical (continuous) edge end distribution $P_e(k)$ from the joint degree distribution
$P(j,k)$ and transforming it into a degree distribution $P(k)$. From this distribution, a
degree sequence of length $N$ is drawn.

2. Each element of the degree sequence represents a vertex. All vertices are then sorted
into degree classes, each containing only vertices of the same degree $k$.

3. To compensate for discretization effects caused by the finiteness of the sampled
network, one has to calculate the discrete edge end distribution $P_e^{(d)}(k)$ from the
generated degree sequence. To do so, one acquires, by determining the size of each degree
class, the discrete degree distribution $P^{(d)}(k)$, which corresponds to a discrete edge
end distribution by $P_e^{(d)}(k) = k P^{(d)}(k)/\bar{k}$.

4. Next, the discrete conditional probability $P^{(d)}(j|k)$ is set up. To obtain a matrix
which accommodates the discretization effects, one replaces the continuous edge end
distributions $P_e(k)$ in the definition of the conditional probability distribution of
Eq. (6) by the discrete edge end distributions $P_e^{(d)}(k)$ and therefore obtains

$$ P^{(d)}(j|k) = \frac{P_e^{(d)}(j)}{P_e(j)} \frac{P(j,k)}{P_e(k)} = P_e^{(d)}(j) f(j,k). \qquad (9) $$

Since we mix the discrete edge end distribution $P_e^{(d)}(j)$ and the continuous
correlation function $f(j,k)$, the resulting conditional degree distribution
$P^{(d)}(j|k)$ is only approximately normalized for a given degree class $k$. To obtain a
conditional probability distribution suitable for sampling degree classes, we normalize
each degree class separately, leading to the final form

$$ P^{(d)}(j|k) = \frac{P_e^{(d)}(j) f(j,k)}{\sum_{j'} P_e^{(d)}(j') f(j',k)}. \qquad (10) $$

This definition is consistent with the limit $N \to \infty$, as the discrete edge end
distribution $P_e^{(d)}(j)$ becomes equal in this limit to the continuous edge end
distribution $P_e(j)$ and the ratios $P_e^{(d)}(j)/P_e(j)$ become exactly 1.

5. After all base data structures have been initialized, the algorithm starts to draw edges
by drawing edge ends. The first edge end is selected by first drawing a degree class $k$
from the edge end distribution $P_e^{(d)}(k)$ and then randomly choosing a vertex from that
degree class.

6. The second end of the edge is chosen in the same two-step manner. However, the first
draw of a degree class $j$ is done with the appropriate conditional probability
distribution $P^{(d)}(j|k)$ instead of the edge end distribution $P_e^{(d)}(k)$. This
construction scheme yields correctly correlated graphs, since we have

$$ \underbrace{P_e(k)}_{\text{1st draw}} \, \underbrace{P(j|k)}_{\text{2nd draw}} = P(j,k). \qquad (11) $$

An edge is created whenever the constraint of neither self- nor multiple-edges is met.
Otherwise the drawn edge is rejected and the algorithm continues with step five.
7. If the edge is created, the probability weights of the two edge ends are removed from
the corresponding degree classes in the edge end distribution $P_e^{(d)}(k)$ and the
conditional probability distribution matrix $P^{(d)}(j|k)$. The removal of the probability
weight is equivalent to the removal of the two half-edges from the list of eligible
half-edges in the CM algorithm.

8. Steps five to seven are repeated until no edge ends are left and all edges are formed.
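The steps above can be sketched compactly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the conditional weights of Eq. (10) are recomputed from the remaining half-edge counts at every draw instead of being updated incrementally in a matrix, the correlation function is passed as a callable `f(j, k)`, and the names `correlated_network`, `f_toy`, and `max_tries` are hypothetical.

```python
import random
from collections import defaultdict


def _remove(lst, i):
    """Remove element i in O(1) by swapping it with the last element."""
    lst[i] = lst[-1]
    lst.pop()


def correlated_network(degrees, f, rng, max_tries=200_000):
    # Step 2: degree classes; each class lists one entry per free half-edge.
    stubs = defaultdict(list)
    for v, k in enumerate(degrees):
        stubs[k].extend([v] * k)
    classes = sorted(stubs)
    edges = set()
    for _ in range(max_tries):
        # Step 5: class k with weight proportional to P_e^(d)(k).
        w1 = [len(stubs[k]) for k in classes]
        if sum(w1) < 2:
            break
        k = rng.choices(classes, weights=w1)[0]
        a = rng.randrange(len(stubs[k]))
        # Step 6: class j with weight proportional to P_e^(d)(j) f(j, k);
        # random.choices normalizes the weights, as in Eq. (10).
        w2 = [len(stubs[j]) * f(j, k) for j in classes]
        if sum(w2) <= 0:
            continue
        j = rng.choices(classes, weights=w2)[0]
        b = rng.randrange(len(stubs[j]))
        u, v = stubs[k][a], stubs[j][b]
        edge = (min(u, v), max(u, v))
        if u == v or edge in edges:
            continue  # self- or multiple-edge: reject and redraw
        edges.add(edge)
        # Step 7: remove the two used half-edges from their classes.
        if j == k:
            for i in sorted((a, b), reverse=True):
                _remove(stubs[k], i)
        else:
            _remove(stubs[k], a)
            _remove(stubs[j], b)
    leftover = sum(len(s) for s in stubs.values())
    return edges, leftover


def f_toy(j, k):
    """Hypothetical assortative toy correlation function (not normalized)."""
    return 1.5 if j == k else 0.75


degrees = [3, 3, 2, 2, 2, 2, 1, 1, 1, 1]
edges, leftover = correlated_network(degrees, f_toy, random.Random(0), max_tries=20_000)
```

Recomputing the weights costs a number of operations proportional to the number of realized degree classes per draw, which keeps the sketch short at the expense of the bookkeeping described in step 7.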

The principal numerical cost of the algorithm arises from the repeated sampling of degree
classes in steps five and six above. Since the algorithm has to sample only the degree
classes actually realized, whose number is significantly lower than the system size $N$,
the numerical costs are of the order $O(N^{a})$ with $a < 1$. Furthermore, due to the
removal of the probability weight of used half-edges throughout the construction
procedure, the algorithm samples only the configuration space which remains valid in each
iteration step, just as in the CM algorithm. The memory usage of the algorithm scales with
the square of the number of realized degree classes. This can become a significant
advantage over the CM procedure as described above, since the memory usage of the CM
procedure scales with the number of half-edges needed to construct the network.

To validate our algorithm, we use three empirical networks as test cases: (i) a social
network where the 392,340 vertices are actors and the edges between those are assigned if
they performed in at least one movie together 0]; (ii) a subset of the WWW containing
325,759 web pages which are connected if there exists a link among them [23]; (iii) the
yeast protein-interaction network consisting of 1,846 proteins [24]. The data has been
downloaded from Barabasi's web site http://www.nd.edu/~networks. All self- and
multiple-edges were removed from each network. The actor network is assortatively
($r = 0.27$), the WWW network weakly ($r = -0.053$) and the yeast protein-interaction
network disassortatively ($r = -0.16$) correlated. To test the correctness of the
algorithm, one measures the joint degree distribution $P_{ref}(j,k)$ of the base networks
and uses this function as input for the construction algorithm. The resulting random
network has to display the same degree distribution $P(k)$ and joint degree distribution
$P(j,k)$ as the empirical one. A very sensitive test to validate whether the correlation
structure of the reference and the random network indeed match is on the level of the
correlation function $f(j,k)$, which varies on a much smaller scale than the joint degree
distribution $P(j,k)$. Thus, comparing the reference correlation function $f_{ref}(j,k)$,
which one obtains from the empirical network, with the correlation function $f(j,k)$ of the
network as generated by the algorithm by means of a correlation coefficient (1 means total
agreement, $-1$ indicates that the two functions are of opposite sign, and 0 means no
correlation among the two functions in comparison) reveals almost complete agreement of
(i) 0.99(6), (ii) 0.9(9), and (iii) 0.99(8). A density plot of the reference correlation
function versus the resulting correlation function in Fig. 1 verifies the excellent
agreement of the correlation functions $f(j,k)$ and $f_{ref}(j,k)$. The plot shows the
corresponding values of $f(j,k)$ versus $f_{ref}(j,k)$ for all indices $j$ and $k$ at
either axis. Ideally, all data points would lie on the diagonal, which would be the case if
the two functions were identical, and the density plot would show a delta-shaped line along
the diagonal. As one can see from the plots, the highest density of points, which is
indicated by darker red, is almost solely centered at the diagonal. Just as the correlation
functions coincide, the degree distributions show the same very good agreement, which is
illustrated in Fig. 2. The statistics per curve are $10^2$ randomized realizations for the
actor network, $10^3$ for the WWW network, and $10^4$ for the yeast network in both
figures.

FIG. 1: (color online) Density plot of the correlation function $f_{ref}(j,k)$ of the
empirical network versus the correlation function $f(j,k)$ of the corresponding random
network as generated by the algorithm for all indices $j$ and $k$. Darker red regions
contain a higher density of data points, while lighter red indicates a lower density. The
reference line $y = x$ is drawn as a guide to the eye.

FIG. 2: (color online) Degree distribution $P(k)$ of empirical networks and their
corresponding degree distribution as generated by the algorithm. The red squares denote the
reference points as measured from the empirical networks and the black circles mark values
measured from the randomized networks.
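Measuring the joint degree distribution of a given network, as required for the reference input in this test, can be sketched as follows. The edge-list representation, the function name, and the star-graph example are assumptions of this sketch; each edge is counted in both orientations so that the result is symmetric and normalized.

```python
from collections import Counter, defaultdict


def joint_degree_distribution(edges):
    """Estimate P(j, k) from an undirected edge list without multi-edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    P_joint = defaultdict(float)
    weight = 1.0 / (2 * len(edges))  # every edge contributes two edge ends
    for u, v in edges:
        P_joint[(degree[u], degree[v])] += weight
        P_joint[(degree[v], degree[u])] += weight
    return dict(P_joint)


# Hypothetical example: a star with one hub (degree 3) and three leaves.
star_edges = [(0, 1), (0, 2), (0, 3)]
P_star = joint_degree_distribution(star_edges)
```

In the star every edge joins a degree-3 vertex to a degree-1 vertex, so all probability weight sits on the two off-diagonal entries, the signature of a strongly disassortative network.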
IV. CONTROLLING CORRELATIONS IN NETWORKS

The algorithm described in this paper constructs undirected random networks with an
arbitrary two-point correlation structure. This allows us to test explicitly the influence
of two-point correlations present in a network on its properties. For example, being able
to control the two-point correlation structure of a network allows one to directly test
its influence on dynamical processes taking place on the network. We therefore aim at
developing a formalism which allows one to control the two-point correlations of a whole
network in terms of the average nearest neighbor degree $k_{nn}(k)$ and the Newman factor
$r$, given a fixed degree distribution $P(k)$.

As we want to preserve a given degree distribution $P(k)$, which translates into a given
edge end distribution $P_e(k)$, while varying the joint degree distribution $P(j,k)$, some
restrictions apply to the joint degree distribution. We begin with an ansatz by writing
the joint degree distribution $P(j,k)$ in product form as in Eq. (4), which means

$$ P(j,k) = P_e(j) P_e(k) f(j,k). \qquad (12) $$



It is clear that the correlations in the network are encoded by this ansatz within the
correlation function $f(j,k)$. The relation to the Newman factor $r$ from the definition
Eq. (5) is

$$ r \sigma^2 = \langle jk (f(j,k) - 1) \rangle_{j,k} = \langle jk f(j,k) \rangle_{j,k} - \langle k \rangle^2. \qquad (13) $$

By the notation $\langle \cdot \rangle_{j,k}$, we indicate that the average with respect to
$P_e(k)$ is to be taken simultaneously over the indices $j$ and $k$, just as
$\langle \cdot \rangle$ denotes the average with respect to $P_e(k)$. The correlation
function $f(j,k)$ is also tightly connected to the average nearest neighbor degree function
$k_{nn}(k)$. Using the conditional probability
$P(j|k) = P(j,k)/P_e(k) = P_e(j) f(j,k)$, the definition of Eq. (7) turns into

$$ k_{nn}(k) = \langle j f(j,k) \rangle_j. \qquad (14) $$



Multiplying the average nearest neighbor function $k_{nn}(k)$ by $k P_e(k)$ and summing
over all $k$, we are led to

$$ \langle k k_{nn}(k) \rangle = \langle jk f(j,k) \rangle_{j,k}, \qquad (15) $$

which we can substitute into Eq. (13), leading us finally to

$$ r \sigma^2 = \langle k k_{nn}(k) \rangle - \langle k \rangle^2. \qquad (16) $$



From the constraint of a given degree distribution $P(k)$ it follows that a summation over
either argument of the joint degree distribution $P(j,k)$ has to be equal to the
corresponding edge end distribution $P_e(j)$ (or $P_e(k)$). Thus, the correlation function
$f(j,k)$ has to fulfill the condition

(17)

$$ \langle f(j,k) \rangle_j = 1. \qquad (18) $$



The considerations so far are general. However, as we want to control correlations within
the network, we seek an explicit correlation function $f(j,k)$ which has the property of
Eq. (18) and produces a joint degree distribution which yields a given average nearest
neighbor degree function $k_{nn}(k)$. To do so, we make a simple ansatz for the correlation
function

$$ f(j,k) = 1 + h(j) h(k). \qquad (19) $$



This functional form may be understood as a series expansion of first order, fulfilling
the necessary symmetry property that the correlation function has to be invariant under
exchange of the indices $j$ and $k$. Plugging this ansatz into Eq. (14) takes us to

$$ k_{nn}(k) = \langle k \rangle + \langle j h(j) \rangle h(k), \qquad (20) $$

which means that

$$ h(k) = \frac{k_{nn}(k) - \langle k \rangle}{\langle j h(j) \rangle}. \qquad (21) $$



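The constant $\langle j h(j) \rangle$ can easily be calculated by multiplying Eq. (21) by
$k P_e(k)$ and summing over all $k$. Rearranging the terms then yields

$$ \langle k h(k) \rangle = \sqrt{\langle k k_{nn}(k) \rangle - \langle k \rangle^2} = \sqrt{r \sigma^2}. \qquad (22) $$

Finally, the correlation function $f(j,k)$ has the form

$$ f(j,k) = 1 + \frac{(k_{nn}(j) - \langle k \rangle)(k_{nn}(k) - \langle k \rangle)}{r \sigma^2}. \qquad (23) $$

Applying condition (18) to the ansatz in Eq. (19) yields

$$ \langle h(j) \rangle = 0. \qquad (24) $$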



This property is consistent with the functional form of $h(k)$ in Eq. (21), since the
average of $h(k)$ over $k$ with respect to the edge end distribution $P_e(k)$ yields zero
by usage of Eq. (8) ($\langle k_{nn}(k) \rangle = \langle k \rangle$). Eq. (8) furthermore
helps to construct valid average nearest neighbor functions $k_{nn}(k)$ with an arbitrary
functional dependence upon the degree $k$. Taking a sufficiently smooth and positive
weighting function $g(k)$, the corresponding $k_{nn}(k)$ compatible with Eq. (8) is then

$$ k_{nn}(k) = \frac{\langle k \rangle}{\langle g(k) \rangle} g(k). \qquad (25) $$
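The construction of Eqs. (19)-(25) can be checked numerically with a short sketch. It builds $k_{nn}(k)$ from a weighting function $g(k)$, forms the correlation function in the product form of Eq. (23), and verifies the conditions of Eqs. (14) and (18). The toy edge end distribution, the choice $g(k) = k^{1/2}$, and all function names are assumptions of this example.

```python
def correlation_from_weighting(P_e, g):
    """Return k_nn (dict) and f (callable) for a weighting function g."""
    mean_k = sum(k * p for k, p in P_e.items())          # <k>
    mean_g = sum(g(k) * p for k, p in P_e.items())       # <g(k)>
    k_nn = {k: mean_k / mean_g * g(k) for k in P_e}      # Eq. (25)
    # Eq. (16): r sigma^2 = <k k_nn(k)> - <k>^2; must be non-zero here.
    r_sigma2 = sum(k * k_nn[k] * p for k, p in P_e.items()) - mean_k ** 2

    def f(j, k):
        # Eq. (23), i.e. f = 1 + h(j) h(k) with h from Eq. (21).
        return 1.0 + (k_nn[j] - mean_k) * (k_nn[k] - mean_k) / r_sigma2

    return k_nn, f


# Hypothetical edge end distribution over the degrees 1..4.
P_e = {1: 0.2, 2: 0.3, 3: 0.3, 4: 0.2}
k_nn, f = correlation_from_weighting(P_e, lambda k: k ** 0.5)

# <f(j,k)>_j for every k (should be 1) and <j f(j,k)>_j (should be k_nn(k)).
norm = {k: sum(P_e[j] * f(j, k) for j in P_e) for k in P_e}
recovered = {k: sum(j * P_e[j] * f(j, k) for j in P_e) for k in P_e}
```

The sketch does not check that `f` stays non-negative or that the resulting joint degree distribution is realizable; those are separate constraints on the admissible degree range.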



However, the resulting correlation function $f(j,k)$ is still constrained by even further
conditions [25]. For example, the ratio $r_{j,k}$ as introduced in Ref. [27] is defined as
the actual number of connections $E_{j,k}$ ($= P(j,k) \bar{k} N$) divided by the maximal
number of connections $m_{j,k}$ among the degree classes $j$ and $k$. For networks without
multiple edges this ratio is given by

$$ r_{j,k} = \frac{E_{j,k}}{m_{j,k}} = \frac{P(j,k)}{\min\{ P_e(j), P_e(k), \bar{k} N P_e(j) P_e(k)/jk \}}. \qquad (26) $$



It is clear that this ratio must always be in the range between 0 and 1 for all valid
degree classes $j$ and $k$ present in the network,

$$ 0 \le r_{j,k} \le 1 \quad \forall \, j, k \in [k_{\min}, k_{\max}]. \qquad (27) $$



From this condition the admissible degree range $[k_{\min}, k_{\max}]$ becomes dependent
upon the details of the correlation function $f(j,k)$. To proceed, we choose as an example
the average nearest neighbor function to be a power law $k_{nn}(k) \propto k^{\alpha}$, as
this functional form roughly approximates the measured average nearest neighbor function
of various empirical networks. Using this ansatz, one obtains the final form of the
correlation function as

$$ f(j,k) = 1 + \frac{1}{\langle k^{\alpha+1} \rangle / \langle k \rangle - \langle k^{\alpha} \rangle} \, \frac{1}{\langle k^{\alpha} \rangle} \, (j^{\alpha} - \langle k^{\alpha} \rangle)(k^{\alpha} - \langle k^{\alpha} \rangle). \qquad (28) $$



Up to this point the degree distribution $P(k)$ or, equivalently, the edge end distribution
$P_e(k)$ is still arbitrary, as the former enters Eq. (28) only via the averages
$\langle \cdot \rangle$ used in the definition of the correlation function $f(j,k)$.
Nevertheless, the range of the exponent $\alpha$ is limited, since the condition of
Eq. (27) has to be fulfilled. A further complication arises from intrinsic correlations
caused by the constraint of the absence of self- and multiple-edges. In the following we
discuss these issues for scale-free networks and empirical networks in detail.
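The admissibility condition of Eq. (27) can be explored numerically. The sketch below evaluates the ratio of Eq. (26) over all degree pairs for a power-law correlation function of the form of Eq. (28). The scale-free test distribution, the parameter values, and the function names are assumptions of this example; no particular outcome of the check is implied for other parameters.

```python
def power_law_correlation(P_e, alpha):
    """Correlation function of Eq. (28) for k_nn(k) proportional to k**alpha."""
    if alpha == 0:
        return lambda j, k: 1.0  # uncorrelated case; Eq. (28) is singular here

    def moment(e):
        return sum(k ** e * p for k, p in P_e.items())

    m1, ma, ma1 = moment(1), moment(alpha), moment(alpha + 1)
    denom = (ma1 / m1 - ma) * ma
    return lambda j, k: 1.0 + (j ** alpha - ma) * (k ** alpha - ma) / denom


def ratio_bounds(P_e, f, kbar, N):
    """Smallest and largest r_{j,k} of Eq. (26) over all degree pairs."""
    lo, hi = float("inf"), float("-inf")
    for j in P_e:
        for k in P_e:
            P_jk = P_e[j] * P_e[k] * f(j, k)                      # Eq. (12)
            m = min(P_e[j], P_e[k], kbar * N * P_e[j] * P_e[k] / (j * k))
            lo, hi = min(lo, P_jk / m), max(hi, P_jk / m)
    return lo, hi


# Hypothetical scale-free test case on the degree range [2, 100].
gamma, alpha, N, k_max = 2.5, 0.2, 10_000, 100
ks = range(2, k_max + 1)
Z = sum(k ** -gamma for k in ks)
P = {k: k ** -gamma / Z for k in ks}
kbar = sum(k * p for k, p in P.items())
P_e = {k: k * p / kbar for k, p in P.items()}

f_alpha = power_law_correlation(P_e, alpha)
lo_uc, hi_uc = ratio_bounds(P_e, lambda j, k: 1.0, kbar, N)
lo_c, hi_c = ratio_bounds(P_e, f_alpha, kbar, N)
admissible = 0.0 <= lo_c and hi_c <= 1.0  # condition of Eq. (27)
```

For the uncorrelated case the largest ratio reduces to the larger of the maximal edge end probability and $k_{\max}^2/\bar{k}N$, which is how a cut-off on $k_{\max}$ follows from Eq. (27).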



A. Scale-Free Networks 

The degree distribution $P(k)$ of a scale-free network is defined by

$$ P(k) \propto k^{-\gamma}, \qquad (29) $$

where $\gamma$ is the scale-parameter. The edge end distribution is therefore given by

$$ P_e(k) \propto k^{-\gamma+1}. \qquad (30) $$



As we only discuss finite networks, the range of admissible degrees $k$ is limited by
various conditions. First, the rapidly decreasing probability for increasing degrees $k$
requires a cut-off of the degree range at a maximal degree $k_{\max}$ above which the
accumulated probability weight is equal to $1/N$. This yields the so-called natural
cut-off

$$ k_{\max}^{\mathrm{natural}} \propto N^{1/(\gamma-1)}. \qquad (31) $$

This natural cut-off is necessary to prevent large fluctuations in a finite random network
ensemble and is an upper limit for the maximal degree $k_{\max}$. It is important to
emphasize that this cut-off is by no means induced by the topology of the complex network.

However, it turns out that the natural cut-off is not always compatible with the condition
of Eq. (27), which can easily be used to determine the so-called structural cut-off. In the
case of scale-free networks, Eq. (26) reduces for sufficiently large degrees $j$ and $k$ to
$r_{j,k} = jk f(j,k)/\bar{k}N$ and therefore defines a maximal degree $k_{\max}$ at the
upper bound for the ratio ($r_{k_{\max},k_{\max}} = 1$). With this criterion, one obtains,
in the case of uncorrelated networks having a constant correlation function $f(j,k) = 1$,
the scale-parameter independent cut-off $k_{\max}^{\mathrm{structural}} \propto N^{1/2}$.
This is smaller than the natural cut-off for values of the scale-parameter in the range
$2 < \gamma < 3$. Nevertheless, newer calculations by Dorogovtsev et al. [26] reveal that
this structural cut-off is still too large in that particular range of the scale-parameter
$\gamma$ and causes intrinsic correlations to arise within otherwise uncorrelated networks
without self- or multiple-edges. Due to the maximal degree $k_{\max}$ being too large and
the required constraints, the vertices with large degrees $k$ have a tendency to connect
preferably with low-degree vertices, which effectively yields disassortativity. The reason
for the failure of condition (27) in the case of scale-free networks with a scale-parameter
$\gamma$ in the range $(2, 3]$ can be seen in the diverging fluctuations in the degree
distribution, as only the first moment of the degree distribution $P(k)$ is finite. The
approach taken by Dorogovtsev et al. is based upon a statistical ensemble ansatz. A
canonical network ensemble is defined as the set of networks with a fixed set of vertices
and a fixed number of edges. The final networks are then the outcome of an evolution
process where randomly chosen edges are removed and simultaneously added to a pair of
vertices in the network. The pair of vertices is chosen at random with weights given by
the product of a preferential function $f(j) f(k)$, where $j$ and $k$ are the degrees of
the respective vertices. With the preferential function $f(k) = k + 1 - \gamma$ and below
the critical temperature, the authors observe that the degree distribution becomes
scale-free. However, depending upon the finiteness of the second moment of the degree
distribution, Dorogovtsev et al. find different cut-offs of the degree range



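$$ k_{\max}^{\mathrm{ensemble}} \propto \begin{cases} N^{1/2} & \text{if } \gamma > 3 \\ N^{1/(5-\gamma)} & \text{if } 2 < \gamma < 3. \end{cases} \qquad (32) $$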



The evolution process driving a network into this equilibrium network is, of course,
neither the same as constructing a network with the CM algorithm nor with the algorithm
developed in this paper. The CM algorithm and the algorithm presented in this paper,
however, fix a priori the number of vertices and edges as well, just as in the canonical
network ensemble. Thus, both algorithms can be interpreted to produce graphs which are
members of the canonical network ensemble below the critical temperature, since both
approaches evidently yield random networks with the correct degree distribution.
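To compare the three cut-offs discussed above, the following sketch evaluates their scaling with the system size. The proportionality constants are set to 1, which is an assumption of this example, as are the function names; at $\gamma = 3$ both branches of the ensemble cut-off give $N^{1/2}$, so the boundary is assigned to the first branch.

```python
def natural_cutoff(N, gamma):
    """Natural cut-off, proportional to N**(1/(gamma-1)), Eq. (31)."""
    return N ** (1.0 / (gamma - 1.0))


def structural_cutoff(N):
    """Structural cut-off for uncorrelated networks, proportional to N**(1/2)."""
    return N ** 0.5


def ensemble_cutoff(N, gamma):
    """Cut-off of the canonical ensemble, Eq. (32)."""
    if gamma >= 3:
        return N ** 0.5
    return N ** (1.0 / (5.0 - gamma))


N, gamma = 10 ** 6, 2.5
k_natural = natural_cutoff(N, gamma)       # about 10**4
k_structural = structural_cutoff(N)        # 10**3
k_ensemble = ensemble_cutoff(N, gamma)     # about 10**2.4
```

For $2 < \gamma < 3$ the three values are ordered as ensemble < structural < natural, which illustrates why the structural cut-off is still too large in this range.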

Up to this point, we have only treated the uncorrelated case, which corresponds to
$\alpha = 0$ in Eq. (28). Numerical experiments indicated a strong deviation from the
expected power law for the measured average nearest neighbor function $k_{nn}(k)$ in the
case of assortative networks, which have $\alpha > 0$, if one naively uses a cut-off as it
is applicable for uncorrelated networks. The average nearest neighbor function shows that
the vertices with the largest degree fall below their expected average nearest neighbor
value and therefore tend to cause some degree of disassortativity. This effect is rooted in
the constraint of the prevention of self- and multiple-edges and becomes stronger for
larger values of the exponent $\alpha$. To compensate for this effect, we incorporated the
exponent $\alpha$ in the exponents of the maximal degrees identified so far in a simple way
(an analytically exact derivation is beyond the scope of this paper) and always use the
minimal resulting maximal degree,

$$ k_{\max} = \min\{ N^{(1-\alpha)/(5-\gamma)}, N^{(1-\alpha)/2}, N^{1/(\gamma-1)} \}. \qquad (33) $$

Using a maximal degree of this form lowers (raises) the cut-off degree for assortative
(disassortative) correlations with increasing (decreasing) exponent $\alpha$. Having fixed
the maximal degree $k_{\max}$, we set the minimal degree $k_{\min}$ to be 2 in all
simulations. This ensures that we always obtain a giant component in the network having
almost the size of the entire network, which in turn guarantees that the giant component
has the same two-point correlation structure as the entire network. This is favorable,
since in most applications only the largest component of the generated random networks is
of interest.

As already pointed out, it is crucial to note that only
the first moment of the degree distribution is finite for
values of the scale-parameter γ in the range (2, 3], while
all higher moments diverge. However, already the first
moment of the edge end distribution P_e(k) diverges in this
range of the scale-parameter γ. This has the important
consequence that the average nearest neighbor function
k_nn(k) becomes system size dependent, as ⟨k_nn(k)⟩ = ⟨k⟩ by
Eq. ©. To validate the predicted power-law behavior of the
average nearest neighbor function k_nn(k), we employ a
dimensionless data-collapse of the function,

\[ k_{nn}(k)\, k^{-\alpha}\, \frac{\langle k^{\alpha} \rangle}{\langle k \rangle} = 1. \tag{34} \]

This type of plot is extremely sensitive to even the
smallest deviations from the predicted power law in the
average nearest neighbor function k_nn(k). The numerical
results for various values of the scale-parameter γ and the
exponent α are shown in Fig. 3 for networks of size
N = 10^6. Each data point is calculated over an ensemble of
10^3 random networks. The curves run quite

FIG. 3: (color online) Data-collapse for the average nearest
neighbor function k_nn(k) ∝ k^α with various values of the
exponent α for networks with a scale-free degree
distribution with varying values of the scale-parameter γ.
The symbols used for the different values of the exponent α
are: blue circle -0.2, pink square -0.1, dark green triangle
up 0.0, red diamond 0.1, yellow triangle down 0.2, and light
green star 0.3.

nicely along the predicted constant line of 1. In particular,
the α = 0 curves coincide with the constant line of 1,
which is a further, very important validation of the
algorithm, since in this case the algorithm has to coincide
with the well-known UCM algorithm [10]. Three details are
worth noting: (i) with decreasing α the curves become longer
as the maximal degree k_max increases; (ii) not all values of
the exponent α can be realized for a given value of the
scale-parameter γ, as condition rj t k > is violated for
some curves and would require a further adjustment of k_max
or even k_min; (iii) with increasing scale-parameter γ the
curves for larger values of the exponent α show a trend to
bend slightly below the constant line of 1, which indicates
that the cut-off of Eq. (33) still gives slightly too large
values for the maximal degree k_max. Another test of our
formalism is to compare the Newman factor r of the resulting
networks to the values predicted analytically by Eq. p3[).
Fig. 4 shows that numerical simulations (points) and
theoretical predictions (lines) coincide very well.
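
As a rough illustration of how the Newman factor r can be measured on a generated network, the following sketch computes it as the Pearson correlation of the degrees found at the two ends of every edge. The function name `newman_r` and the edge-list input format are assumptions made for this example only; the sketch also assumes a non-empty simple undirected graph.

```python
import math
from collections import Counter


def newman_r(edges):
    """Newman factor r of an undirected simple graph given as a list of (u, v) pairs."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    m = len(edges)
    # Averages over all edges; each edge contributes both of its ends.
    mean_prod = sum(degree[u] * degree[v] for u, v in edges) / m
    mean_deg = sum(0.5 * (degree[u] + degree[v]) for u, v in edges) / m
    mean_sq = sum(0.5 * (degree[u] ** 2 + degree[v] ** 2) for u, v in edges) / m

    variance = mean_sq - mean_deg ** 2
    if math.isclose(variance, 0.0, abs_tol=1e-12):
        raise ValueError("r is undefined when all edge ends have the same degree")
    return (mean_prod - mean_deg ** 2) / variance


# A star is maximally disassortative (r = -1); a path of four vertices gives r = -1/2.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
path = [(0, 1), (1, 2), (2, 3)]
print(newman_r(star), newman_r(path))
```

Averaging such values over an ensemble of generated networks gives the points that are compared with the analytical prediction.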

The diverging moments ⟨k⟩ and ⟨k^{α+1}⟩ of the edge end
distribution P_e(k) for values of the scale-parameter γ
within the range (2, 3] make a careful inspection of
finite-size effects necessary. One can easily see that the
ratio ⟨k^{α+1}⟩/⟨k⟩, appearing in the denominator of the
correlation function f(j,k) in Eq. (28), diverges, as the
ratio becomes proportional to k_max. Nevertheless, a detailed
calculation reveals certain restrictions on the maximal
range of admissible degrees k if α is chosen to be different
from 0. In this case, the criterion > leads to a relation
between the minimal degree k_min and the maximal degree
k_max. Thus, the range of admissible degrees

FIG. 4: (color online) Newman factor r as a function of the
exponent α for different values of the scale-parameter γ.
The straight line denotes the theoretical values of the
Newman factor r according to Eq. (16). The symbols denote the
value of the scale-parameter γ: blue circle 2.0, pink square
2.4, dark green triangle 2.8, and red diamond 3.2.

FIG. 5: (color online) Network size dependence of the Newman
factor r as a function of the exponent α for different
values of the scale-parameter γ. The network size N is
marked by the symbols: blue circle 10^4, pink square 10^5,
and dark green triangle 10^6.



is limited and the moments ⟨k⟩ and ⟨k^{α+1}⟩, which would
otherwise diverge, remain finite. Fig. 5 shows the
finite-size effects on the Newman factor r as a function of
the exponent α. The plot shows only a marginal effect of the
system size N on the curves. However, for smaller sizes, a
broader range of the exponent α can be used. This is due to
a violation of the rj t k > criterion, which requires for
larger networks either a smaller maximal degree k_max than
the one obtained from Eq. (33) or a greater minimal degree
k_min. Despite the restrictions that apply to the ansatz
made, the range of accessible correlations spans the range
of correlations found in empirical networks very well.
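
To make the finite-size dependence of these moments concrete, the sketch below evaluates them for a power-law degree distribution truncated to k_min ≤ k ≤ k_max. It assumes the edge end distribution is P_e(k) ∝ k P(k) with P(k) ∝ k^(-γ); the helper name `edge_end_moment` and the parameter values are chosen for illustration only.

```python
import numpy as np


def edge_end_moment(gamma, power, k_min, k_max):
    """<k^power> under P_e(k) ∝ k^(1 - gamma), restricted to k_min <= k <= k_max."""
    k = np.arange(k_min, k_max + 1, dtype=float)
    weights = k ** (1.0 - gamma)  # P_e(k) ∝ k P(k) with P(k) ∝ k^(-gamma)
    return float(np.sum(weights * k ** power) / np.sum(weights))


gamma, alpha, k_min = 2.5, 0.2, 2
for k_max in (10**2, 10**3, 10**4):
    first = edge_end_moment(gamma, 1.0, k_min, k_max)
    mixed = edge_end_moment(gamma, alpha + 1.0, k_min, k_max)
    print(f"k_max={k_max:>6}  <k>={first:8.2f}  <k^(alpha+1)>={mixed:10.2f}")
```

For γ in the range (2, 3] both moments keep growing with k_max, so any quantity built from them inherits a dependence on the cut-off and hence on the system size.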



B. Empirical Networks 

A very interesting aspect of our formalism is its
applicability to empirical networks. By extracting a degree
sequence from an empirical network and employing the
formalism developed in the last section, it is possible to
create random networks which have the same degree sequence
as the empirical network and an arbitrarily chosen average
nearest neighbor function k_nn(k), for instance one
following a power law with tunable exponent α. Thus, given a
degree sequence from a network, one constructs from it the
corresponding edge end distribution P_e(k) and then
calculates via Eq. (28) a joint degree distribution P(j, k),
with which one builds a randomized network. As a result, one
obtains randomized versions of the empirical network with
freely tunable two-point correlation strength, depending
upon the choice of the exponent α. However, the range of the
exponent α is limited by condition (27). In Fig. 6(a), (b),
and (c) the numerical results are shown for the actor, the
WWW, and the yeast network, respectively. The plot uses the
same type of data-collapse as already presented in Fig. 3.
The deviations from the expected constant value of 1 for
the data-collapse are due to intrinsic correlations which
arise in networks with neither self- nor multiple-edges and
are caused by the maximal degree k_max in the degree
sequence (see Section IV A). The WWW network in particular
is strongly affected by this, as it has a maximal degree
k_max of the order of 10^4, while the network size is 10^5
and hence only one order of magnitude greater.
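
The first steps of this procedure can be sketched as follows: build the edge end distribution from a degree sequence, fix the prefactor of a power-law k_nn(k) by the normalization ⟨k_nn(k)⟩ = ⟨k⟩, and evaluate the data-collapse of Eq. (34). The sketch assumes that P_e(k) = k N_k / Σ_i k_i (with N_k the number of vertices of degree k) and that the averages in Eq. (34) are taken over P_e(k). All function names and the toy degree sequence are hypothetical. The construction of P(j, k) via Eq. (28) is not reproduced here.

```python
import numpy as np


def edge_end_distribution(degrees):
    """Degree classes k and P_e(k) = k N_k / sum_i k_i for a degree sequence."""
    classes, counts = np.unique(np.asarray(degrees), return_counts=True)
    weights = classes * counts
    return classes, weights / weights.sum()


def power_law_knn(classes, p_e, alpha):
    """k_nn(k) = C k^alpha, with C fixed by sum_k P_e(k) k_nn(k) = sum_k P_e(k) k."""
    k = classes.astype(float)
    norm = np.sum(p_e * k) / np.sum(p_e * k ** alpha)
    return norm * k ** alpha


def data_collapse(classes, p_e, knn, alpha):
    """Rescaled k_nn(k) k^(-alpha) <k^alpha>/<k>; equals 1 for a perfect power law."""
    k = classes.astype(float)
    return knn * k ** (-alpha) * np.sum(p_e * k ** alpha) / np.sum(p_e * k)


degrees = [1, 1, 1, 2, 2, 3, 3, 4, 7]  # toy stand-in for an empirical degree sequence
classes, p_e = edge_end_distribution(degrees)
target = power_law_knn(classes, p_e, alpha=0.2)
print(data_collapse(classes, p_e, target, alpha=0.2))
```

Passing a k_nn(k) measured on a generated network instead of `target` yields curves of the kind shown in the data-collapse plots; deviations from 1 then indicate intrinsic correlations.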



V. ANNEALED NETWORKS 

To investigate, for example, a dynamical process on random
networks, one typically performs the dynamical process on a
whole ensemble of networks and computes averages of the
observables one is interested in. The algorithm presented so
far is suitable for generating such a random network
ensemble. The network itself always stays constant during
one dynamical process, and one typically refers to this type
of network as a static or quenched network. A different
approach is to change the network on a certain time-scale
during a dynamical process and then calculate averages over
time of the observables one is interested in. In an extreme
case, the vertices of the network are reshuffled before
every microscopic step of the dynamics. Such changing
networks are referred to as annealed networks
(see Ref. [H HI M Hj). If the dynamic is local in 
each microscopic step (for instance a diffusion step from
one vertex to another along an edge), it is sufficient to
draw edges on demand only and to generate solely the local
connections around the vertex considered. Here, we propose a
scheme which efficiently simulates such annealed networks.
The idea is to treat the vertices of a network as discrete
objects, while the edges are solely represented by an
arbitrary joint degree distribution P(j, k), such that the
connectivity structure of the network is only defined

FIG. 6: (color online) Data-collapse for the average nearest
neighbor function k_nn(k) for the three empirical networks:
the actor, WWW, and yeast protein-interaction networks. The
left column (a), (b), and (c) shows the data-collapse for
networks generated by the algorithm, while the right column
(d), (e), and (f) shows the same data-collapse for networks
simulated in an annealed manner. The statistics for each
curve is 10^2, 10^3, and 10^4 realizations, respectively.
The different symbols indicate different values of the
exponent α: blue circle -0.2, pink square -0.1, dark green
triangle up 0.0, red diamond 0.1, and yellow triangle down
0.2.



on average. Hence, this scheme effectively simulates the
network's connectivity structure in a mean field (MF) like
manner.

This is a very convenient tool, as theoretical approaches
to complex network topics are frequently based on MF
theories. Successful examples are reaction-diffusion
systems [H, , epidemic disease spreading [22], and phase
transitions in ferromagnetic magnets [34], to mention just a
few. These theories usually describe the network topology
via a statistical approach. Thus, it is desirable to
represent networks numerically in a probabilistic manner as
well. This allows an even better test of MF based theories,
since the network is represented as it is within the theory.
Furthermore, by comparing quenched with annealed
simulations, one can analyze in detail which aspects of such
a MF theory are an over-approximation due to the MF
assumption. We define such an annealed network to consist of
a degree sequence {k_i} of size N and a corresponding joint
degree distribution P(j, k). Each element i of the degree
sequence represents a vertex with k_i connections. Thus, the
set of edges is not fixed; only the total number of edges
(N_e = Σ_i k_i / 2) is held constant. Whenever, for example,
a dynamical process requests an adjacent vertex of a given
vertex, the neighbor vertex is instantly determined by
sampling one



edge which emanates from the given vertex. This edge is
drawn from the joint degree distribution P(j, k) and is
discarded immediately after use.

This simulates a continuously rewired network which is
only locally defined by means of one edge at a time. The
first four steps to set up such an annealed network are
basically the same as for the initialization of the
algorithm of Section III: (i) Draw a degree sequence from
the joint degree distribution P(j, k) or take the degree
sequence from a real network. That degree sequence is (ii)
sorted according to degree classes and (iii) mapped into a
discrete edge end distribution P_e^(d)(k). In the same
manner as done previously, (iv) one calculates the discrete
conditional degree distribution P^(d)(j|k) from the
theoretical joint degree distribution P(j, k). Now, instead
of constructing the network, one only redefines how
neighbors of vertices, and hence edges, have to be
understood:

• The neighbor vertices of a vertex with degree k are
always drawn from the conditional probability
distribution P^(d)(j|k).

• An edge is sampled by first drawing a vertex via the
edge end distribution P_e^(d)(k); the neighbor of this
vertex is then found by sampling the conditional
probability distribution P^(d)(j|k).
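
Step (iv) and the second rule above can be sketched at the level of degree classes. The sketch assumes that the joint distribution is available as a symmetric matrix over the degree classes present in the sequence, with column sums equal to the discrete edge end distribution; this matrix layout and the function names are assumptions for illustration, not taken from the algorithm's definition.

```python
import numpy as np


def conditional_distribution(joint):
    """Return cond with cond[j, k] = P(j|k) = P(j, k) / sum_j' P(j', k)."""
    joint = np.asarray(joint, dtype=float)
    if not np.allclose(joint, joint.T):
        raise ValueError("P(j, k) must be symmetric for an undirected network")
    column_sums = joint.sum(axis=0, keepdims=True)  # proportional to P_e(k)
    return joint / column_sums


def sample_edge_classes(rng, joint):
    """Draw the degree-class indices (k, j) of the two ends of one edge."""
    joint = np.asarray(joint, dtype=float)
    p_e = joint.sum(axis=0)
    p_e = p_e / p_e.sum()
    cond = conditional_distribution(joint)
    k_idx = rng.choice(len(p_e), p=p_e)              # first end: edge end distribution
    j_idx = rng.choice(len(p_e), p=cond[:, k_idx])   # second end: conditional on k
    return int(k_idx), int(j_idx)


# Toy joint distribution over two degree classes.
joint = np.array([[0.1, 0.2],
                  [0.2, 0.5]])
print(conditional_distribution(joint))
print(sample_edge_classes(np.random.default_rng(1), joint))
```

Each column of the returned matrix is a normalized distribution over neighbor degree classes, which is what makes drawing neighbors on demand possible without storing any edges.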

As we want the network to be free of self-connections, we
ensure that the sampled vertices at both ends of a sampled
edge are not the same. However, the constraint of preventing
multiple-edges among vertices cannot be enforced within this
local definition of the network. Therefore, these annealed
networks are free of the intrinsic degree correlations which
arise due to this particular constraint. This becomes
apparent in Fig. 6(d), (e), and (f), where numerical results
for annealed networks are shown as a data-collapse for the
average nearest neighbor function k_nn(k), alongside the
corresponding curves for the case where the network is
actually constructed (Fig. 6(a), (b), and (c)). Only the
curve for the WWW network, Fig. 6(e), deviates from the
expected value of 1 for very large degrees. This has to be
attributed to the prevention of self-connections, which is
still enforced. Since these vertices with a very large
degree are not allowed to connect to themselves, they have
to connect on average with vertices which have a degree
below the preassigned average nearest neighbor function
k_nn(k), causing a slight trend towards disassortativity.
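
A vertex-level version of this scheme, including the rejection of self-connections, might look like the following sketch. It takes the conditional distribution as a precomputed matrix over degree classes and picks a vertex uniformly within the sampled degree class; both choices, as well as the class name `AnnealedNetwork` and the `max_tries` guard, are assumptions made for this illustration.

```python
import numpy as np


class AnnealedNetwork:
    """Vertices are explicit; edges exist only as samples from P(j|k)."""

    def __init__(self, degrees, cond, seed=None):
        self.degrees = np.asarray(degrees)
        self.classes = np.unique(self.degrees)
        # Index of the degree class of every vertex, and the vertices in each class.
        self.class_of = np.searchsorted(self.classes, self.degrees)
        self.members = [np.flatnonzero(self.class_of == c)
                        for c in range(len(self.classes))]
        self.cond = np.asarray(cond, dtype=float)  # cond[j, k] = P(j|k) over class indices
        self.rng = np.random.default_rng(seed)

    def neighbor(self, v, max_tries=1000):
        """Sample one neighbor of vertex v, rejecting self-connections."""
        k_idx = self.class_of[v]
        for _ in range(max_tries):
            j_idx = self.rng.choice(len(self.classes), p=self.cond[:, k_idx])
            w = int(self.rng.choice(self.members[j_idx]))
            if w != v:
                return w
        raise RuntimeError("no admissible neighbor found")

    def edge(self):
        """Sample one edge: first end proportional to degree, second end via P(j|k)."""
        probs = self.degrees / self.degrees.sum()
        v = int(self.rng.choice(len(self.degrees), p=probs))
        return v, self.neighbor(v)


# Uncorrelated toy example: every column of cond equals P_e.
degrees = [1, 1, 2, 2, 3, 3]
classes, counts = np.unique(degrees, return_counts=True)
p_e = classes * counts / np.sum(classes * counts)
net = AnnealedNetwork(degrees, np.tile(p_e[:, None], (1, len(classes))), seed=0)
walker = 0
for _ in range(5):  # five steps of a random walk on the annealed network
    walker = net.neighbor(walker)
```

No edge is stored between calls, so multiple-edges cannot be excluded here, in line with the discussion above; only the self-connection check is enforced.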



VI. CONCLUSIONS 

In summary, we have presented an efficient and accurate
algorithm which generates networks with an a priori given
two-point correlation structure defined by an arbitrary
joint degree distribution P(j, k). This provides much better
null models for the investigation of empirical networks, as
these are usually two-point correlated. Besides the
applicability to reconstructing the two-point correlations
of empirical networks, we developed a formalism which allows
one to systematically tune the strength of two-point
correlations in a network while preserving the degree
distribution P(k) of the network. The two-point correlations
are specified in our ansatz via the average nearest neighbor
function k_nn(k), which we exemplified by a power-law ansatz
k_nn(k) ∝ k^α with the tunable exponent α. As two important
examples, we employed this formalism in the cases of
scale-free networks and empirical networks. However, as
intrinsic degree correlations arise from the constraint of
preventing self- and multiple-edges, these cause inevitable
deviations from the theoretically preassigned two-point
correlations.



Furthermore, we found that the maximal cut-off degree k_max
required to prevent these intrinsic correlations in
artificial scale-free networks is substantially lower than
previously believed.

Finally, we introduced the notion of two-point correlated
annealed networks, which are ideally suited to test the
validity of mean field theories, since the edges of these
networks are solely represented in a probabilistic manner.

Using this algorithm and the new formalism developed, one
can investigate the effects of two-point correlations in
empirical and artificial networks. Such a scheme is expected
to be an important tool to better understand, for example,
how the topology of a network influences dynamical processes
on it.